Abstract

In this paper, we take a control-theoretic approach to answering some
standard questions in statistical mechanics. A central problem is the
relation between systems which appear macroscopically dissipative but are
microscopically lossless. We show that a linear macroscopic system is
dissipative if and only if it can be approximated by a linear lossless
microscopic system, over arbitrarily long time intervals. As a by-product,
we obtain mechanisms explaining Johnson-Nyquist noise as initial
uncertainty in the lossless state, as well as measurement back action and a
trade-off between process and measurement noise.



1 Introduction 

The derivation of thermodynamics as a theory of large systems that are
microscopically governed by the fundamental laws of physics (Newton's laws or
quantum physics) has a large literature, and tremendous progress has been made
for over a century within the field of statistical physics. See, for instance,
[1] for a physicist's account of statistical mechanics. Nevertheless, from a
control theorist's perspective, there are inadequacies in the existing
treatment, both in the level of mathematical rigor and in the applicability to
far-from-equilibrium systems, particularly when subject to complex regulatory
mechanisms. Substantial work has already been done in formulating various
results of classical thermodynamics in a more mathematical framework (e.g.,
[2-6] is a small sample), but statistical mechanics has received much less
attention. This paper focuses on simple problems in statistical mechanics in
which the issue of rigor can be pursued, but aims also to set the stage for
broader applicability.

In particular, we construct a simple and clear control-theoretic modeling 
framework in which the only assumptions on the nature of the physical systems 
are conservation of energy and causality and all systems are of finite dimension 
and act on finite time horizons. We construct high-order lossless systems that 
approximate low-order dissipative systems in a systematic manner, and prove 
that a linear model is dissipative if and only if it is arbitrarily well
approximated by lossless causal linear systems over an arbitrarily long time
horizon. We show how the error between the systems depends on the number of states in
the approximation and the length of the time horizon. Since human experience 
is based on a finite window of space and time, we argue that no human can 
directly distinguish between a low-order macroscopic dissipative system and its 
high-order lossless approximation. 

The lossless systems studied here are consistent with classical physics, since 
they conserve energy, are causal, and are time reversible. Uncertainty in their 
initial state gives a simple explanation of the Johnson-Nyquist noise that can 
be observed at a macroscopic level. We also derive some well-known results 
from statistical mechanics, including the fluctuation-dissipation theorem. As a 
further application, we study the implications of these results for an idealized
measurement device and exhibit a back-action effect that arises naturally in a
purely classical setting: there is no precise measurement without perturbation
of the measured system.

We hope this paper is a step towards building a framework for understanding 
fundamental limitations in control and estimation that arise due to the physical 
implementation of measurement and actuation devices. We defer many impor- 
tant and difficult issues here such as how to actually model measurement devices 
realistically. It is also clear that this framework would benefit from a behavioral 
setting [7]. However, for the points we make with this paper, a conventional 
input-output setting with only regular interconnections is sufficient. Aficionados 
will easily see the generalizations, the details of which might be an obstacle to 
readability for others. Perhaps the most glaring unresolved issue is how to best 
motivate the introduction of stochastics. In conventional statistical mechanics, 
a stochastic framework is taken for granted, whereas we aim to explain if and 
when stochastics arise naturally, and in this we are only partially successful. 

The organization of the paper is as follows: In Section 2, we define the
class of linear lossless/causal systems. In Section 3, we derive lossless/causal
approximations of memoryless dissipative systems and obtain Johnson-Nyquist
noise. In Sections 4 and 5, we discuss interconnections of systems and introduce
an idealized measurement device with back action. Finally, in Section 6, we
generalize the procedure from Section 3 to a class of linear dissipative systems
with memory, and in Section 7 obtain the fluctuation-dissipation theorem.



2 Lossless/Causal Linear Systems 



In this paper, we consider linear systems of the form

$$\dot{x}(t) = Jx(t) + Bu(t), \quad x(t) \in \mathbb{R}^n,$$
$$y(t) = B^T x(t), \qquad (1)$$

where $J = -J^T$ and $(J, B)$ is controllable. It is assumed that the input $u(t)$
and the output $y(t)$ are scalars. We define the internal energy of (1) as

$$U(x(t)) = \frac{1}{2} x(t)^T x(t).$$

We argue that these systems have two desirable "physical" properties:
losslessness and causality.

Lossless [8,9] means that the internal energy satisfies

$$\frac{dU(x(t))}{dt} = x(t)^T \dot{x}(t) = y(t)^T u(t) =: w(t),$$

where $w(t)$ is the work rate on the system. The middle equality holds because
$x^T J x = 0$ for a skew-symmetric $J$. If there is no work done on the
system, $w(t) = 0$, then the internal energy $U(t)$ is constant and conserved. If
there is work done on the system, $w(t) > 0$, the internal energy increases. The
work, however, can be extracted again, $w(t) < 0$, since the energy is conserved
and the system is controllable. Conservation of energy is a common assumption
on microscopic models in statistical mechanics [1].

Causal here means that there is no direct term between the input $u$ and the
output $y$, so that there is no instantaneous reaction of the system.
This, too, is a reasonable physical assumption.

Definition 1. Systems (1) that satisfy the above assumptions are simply called
lossless/causal systems.

Later we will seek approximations of dissipative systems in the class of
lossless/causal systems.

The lossless/causal systems are rather abstract but have properties that we
argue are reasonable from a physical point of view, as illustrated by the following
example.



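Example 1. Consider the inductor-capacitor circuit in Fig. 1. Let the input $u$
be the current $i$ through the current source, and the output $y$ the voltage
$v_1$ across the current source.

Figure 1: Inductor-capacitor circuit.

Then a model is given by

$$\dot{x} = \begin{pmatrix} 0 & -1/\sqrt{C_1 L_1} & 0 \\ 1/\sqrt{C_1 L_1} & 0 & -1/\sqrt{L_1 C_2} \\ 0 & 1/\sqrt{L_1 C_2} & 0 \end{pmatrix} x + \begin{pmatrix} 1/\sqrt{C_1} \\ 0 \\ 0 \end{pmatrix} u,$$

$$y = \begin{pmatrix} 1/\sqrt{C_1} & 0 & 0 \end{pmatrix} x, \quad x^T = \begin{pmatrix} \sqrt{C_1}\,v_1 & \sqrt{L_1}\,i_1 & \sqrt{C_2}\,v_2 \end{pmatrix},$$

$$U = \frac{1}{2} x^T x = \frac{1}{2}\left(C_1 v_1^2 + L_1 i_1^2 + C_2 v_2^2\right), \quad w = yu = v_1 i,$$

and it satisfies Definition 1.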
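As a numerical sanity check of Example 1, the following Python sketch builds $J$ and $B$ for the circuit, checks skew-symmetry and controllability, and confirms that $U$ stays constant when $u = 0$. The component values and the helper names `expm_skew` and `energy` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def expm_skew(J, t):
    """exp(J t) for a real skew-symmetric J, via the Hermitian matrix 1j*J."""
    w, V = np.linalg.eigh(1j * J)            # 1j*J = V diag(w) V^H with real w
    return (V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T).real

# Illustrative component values (assumed, not from the paper).
C1, L1, C2 = 1.0, 2.0, 3.0
a, b = 1 / np.sqrt(C1 * L1), 1 / np.sqrt(L1 * C2)
J = np.array([[0.0, -a, 0.0],
              [a, 0.0, -b],
              [0.0, b, 0.0]])
B = np.array([[1 / np.sqrt(C1)], [0.0], [0.0]])

# Controllability matrix [B, JB, J^2 B] should have full rank.
ctrb = np.hstack([B, J @ B, J @ J @ B])
rank = np.linalg.matrix_rank(ctrb)

# With u = 0 the state is x(t) = exp(J t) x(0), so U = x^T x / 2 is constant.
x0 = np.array([1.0, -0.5, 0.25])

def energy(t):
    x = expm_skew(J, t) @ x0
    return 0.5 * float(x @ x)

energies = [energy(t) for t in np.linspace(0.0, 10.0, 6)]
```

The matrix exponential is computed through an eigendecomposition rather than a library routine only to keep the sketch dependent on NumPy alone.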

3 Lossless / Causal Approximations of Dissipative 
Memoryless Systems 

In this section, we see how dissipative models, models where energy disappears, 
can be approximated by the lossless/causal models. We start with simple mem- 
oryless models, which give rise to heat baths and Johnson-Nyquist noise. 

3.1 Dissipative memoryless systems 

Many times macroscopic systems, such as resistors, can be modeled
approximately by simple input-output relations

$$y(t) = ku(t), \qquad (2)$$

where $k$ is a scalar. If $k > 0$, the system is dissipative since we can never extract
any work. This is because the work rate is always nonnegative,

$$w(t) = y(t)u(t) = ku(t)^2 \ge 0,$$

for all $t$ and $u$. Hence, (2) is neither lossless nor causal. Next, we show how we
can approximate (2) arbitrarily well with a lossless/causal system over finite,
but arbitrarily long, time horizons.



First, choose a time interval of interest, $[0, \tau]$, and rewrite (2) using a
convolution integral

$$y(t) = \int_0^{\tau} k\delta(t - s)u(s)\,ds, \qquad (3)$$

when $u$ is at least continuous and has compact support on $[0, \tau]$, and $\delta$ is the
Dirac distribution. Let us call $\tau$ the recurrence time of the model. The
recurrence time interval contains all the time instants where we perform experiments
on the model, and can be very long. Over this time interval, the system is
equally well modeled by the impulse response

$$\kappa(t) = k\sum_{l=-\infty}^{\infty} \delta(t - 2l\tau),$$

which is a $2\tau$-periodic distribution. $\kappa(t)$ can be expanded in a Fourier series
with convergence in the sense of distributions:

$$\kappa(t) = \frac{k}{2\tau} + \sum_{l=1}^{\infty} \frac{k}{\tau}\cos l\omega_0 t,$$

where $\omega_0 = \pi/\tau$. Define the truncated Fourier series by

$$\kappa_N(t) = \frac{k}{2\tau} + \sum_{l=1}^{N} \frac{k}{\tau}\cos l\omega_0 t. \qquad (4)$$



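We can split $\kappa_N(t)$ into its causal and anti-causal parts,
$\kappa_N(t) = \kappa_N^{c}(t) + \kappa_N^{ac}(t)$, where

$$\kappa_N^{c}(t) = 0, \quad t < 0,$$
$$\kappa_N^{ac}(t) = 0, \quad t > 0.$$

We can realize the causal part $\kappa_N^{c}(t)$ as the impulse response of a lossless/causal
system of order $2N + 1$ with the matrices

$$J_N = \begin{pmatrix} 0 & \Omega_N & 0 \\ -\Omega_N & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \Omega_N = \mathrm{diag}\{\omega_0, 2\omega_0, \ldots, N\omega_0\},$$

$$B_N = C_N^T = \sqrt{\frac{k}{\tau}}\begin{pmatrix} 1 & \cdots & 1 & 0 & \cdots & 0 & 1/\sqrt{2} \end{pmatrix}^T. \qquad (5)$$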



We can realize $\kappa_N^{ac}(t)$ with a similar system by reversing time. That the series (4)
converges in the sense of distributions means that for all smooth $u$ of compact
support on $[0, \tau]$ we have that

$$ku(t) = \lim_{N\to\infty}\left(\int_0^{t} \kappa_N^{c}(t - s)u(s)\,ds + \int_t^{\tau} \kappa_N^{ac}(t - s)u(s)\,ds\right).$$

A closer study of the two integrals reveals that

$$\lim_{N\to\infty}\int_t^{\tau} \kappa_N^{ac}(t - s)u(s)\,ds = \frac{1}{2}ku(t+),$$

$$\lim_{N\to\infty}\int_0^{t} \kappa_N^{c}(t - s)u(s)\,ds = \frac{1}{2}ku(t-),$$

because of the anti-causal/causal decomposition. Hence, since $u$ is continuous,
we can model $y(t) = ku(t)$ with only the causal part if we normalize the causal
part with a factor two.

We identify the lossless/causal approximation of (2) with a linear operator
$K_N : C^2(0,\tau) \to C^2(0,\tau)$:

$$y_N(t) = K_N u(t) : \quad y_N(t) = \int_0^{t} 2\kappa_N^{c}(t - s)u(s)\,ds.$$

It is realized by the triple $(J_N, \sqrt{2}B_N, \sqrt{2}C_N)$. We can bound the approximation
error as seen in the following proposition.
error as seen in the following proposition. 

Proposition 1. Assume that $u \in C^2(0,\tau)$ and $u(0) = 0$. Let $y(t) = ku(t)$,
$k > 0$, and $y_N(t) = K_N u(t)$. Then

$$|y(t) - y_N(t)| \le \frac{2k\tau}{\pi^2 N}\left(|u'(t)| + |u'(0)| + \|u''\|_{L_1[0,t]}\right),$$

for $t$ in $[0, \tau]$.

Proof. We have that

$$y(t) - y_N(t) = \sum_{l=N+1}^{\infty} \frac{2k}{\tau}\int_0^{t} \cos l\omega_0(t - s)\,u(s)\,ds, \quad t \in [0, \tau].$$

We have changed the order of summation and integration because this is how
the value of the series is defined in distribution sense. We proceed by using
repeated integration by parts on each term in the series. We have

$$\int_0^{t} \cos l\omega_0(t - s)\,u(s)\,ds = \frac{1}{l\omega_0}\int_0^{t} \sin l\omega_0(t - s)\,u'(s)\,ds$$

$$= \frac{1}{l^2\omega_0^2}u'(t) - \frac{\cos l\omega_0 t}{l^2\omega_0^2}u'(0) - \int_0^{t} \frac{\cos l\omega_0(t - s)}{l^2\omega_0^2}u''(s)\,ds.$$

Hence, we have the bound

$$|y(t) - y_N(t)| \le \sum_{l=N+1}^{\infty} \frac{2k\tau}{\pi^2 l^2}\left(|u'(t)| + |u'(0)| + \int_0^{t} |u''(s)|\,ds\right).$$

Since $\sum_{l=N+1}^{\infty} 1/l^2 \le 1/N$, we can establish the bound in the proposition. □



The proposition shows that by choosing $N$ sufficiently large, we can
approximate the memoryless model (2) as well as we like with a lossless/causal system,
if inputs are smooth. It is a reasonable assumption that inputs, such as
voltages, are smooth since we usually cannot change them arbitrarily fast due to
physical limitations. Physically, we can think of $2N + 1$ as the number of degrees
of freedom in a resistor. This is usually a number with the size of Avogadro's
number, $N \approx 10^{23}$. Then the recurrence time $\tau$ can be very large without a
significant error. This explains how the dissipative model (2) is consistent with
a physics based on energy conserving systems.
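The bound of Proposition 1 can be checked numerically. The Python sketch below evaluates $y_N(t)$ by quadrature for the assumed test input $u(t) = t^2$ (which satisfies $u(0) = 0$) with $k = \tau = 1$, and compares the error with the right-hand side of the proposition. The function names, grid size, and parameter values are illustrative assumptions.

```python
import numpy as np

def trapezoid(f, s):
    """Composite trapezoidal rule for samples f on the grid s."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s)))

def y_N(t, u, N, k=1.0, tau=1.0, n_grid=4001):
    """y_N(t) = int_0^t 2*kappa_N^c(t - s) u(s) ds, by numerical quadrature."""
    w0 = np.pi / tau
    s = np.linspace(0.0, t, n_grid)
    l = np.arange(1, N + 1)[:, None]
    # 2*kappa_N^c(t) = k/tau + sum_l (2k/tau) cos(l w0 t) for t >= 0
    kernel = k / tau + (2 * k / tau) * np.cos(l * w0 * (t - s)).sum(axis=0)
    return trapezoid(kernel * u(s), s)

def bound(t, N, k=1.0, tau=1.0):
    """Right-hand side of Proposition 1 for the test input u(t) = t^2."""
    # u'(t) = 2t, u'(0) = 0, and ||u''||_{L1[0,t]} = 2t
    return 2 * k * tau / (np.pi**2 * N) * (2 * t + 0.0 + 2 * t)

def u(s):
    return s**2          # assumed test input: C^2 with u(0) = 0

t_eval = 0.7
errors = {N: abs(u(t_eval) - y_N(t_eval, u, N)) for N in (5, 20, 80)}
```

In this example the error shrinks roughly like $1/N$, in line with the proposition; other smooth inputs can be tried by replacing `u` and adapting `bound`.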

3.2 Initial conditions in $K_N$

The general solution to the lossless/causal approximation $K_N$ is

$$y_N(t) = \sqrt{2}B_N^T e^{J_N t}x(0) + \int_0^{t} 2\kappa_N^{c}(t - s)u(s)\,ds, \qquad (6)$$

where $J_N$ and $B_N$ are defined in (5), and $x(0)$ is the initial state. It is the second
part of the solution that approximates $ku(t)$. The first part, the homogeneous
solution, is not desired in the approximation, but is always present for a linear
dynamical system. Next, we study the influence of this term.

Proposition 1 suggests that we will need a system of incredibly high order
to approximate the dissipative system (2) on a reasonably long time horizon.
When dealing with systems of such extremely high dimensions, it is reasonable
to assume that the exact initial state $x(0)$ is not known. Therefore, we will take
a statistical approach to study its influence.

We have that

$$\mathbf{E}y_N(t) = \sqrt{2}B_N^T e^{J_N t}\,\mathbf{E}x(0) + \int_0^{t} 2\kappa_N^{c}(t - s)u(s)\,ds,$$

if the input $u$ is deterministic and known. The covariance function for $y_N(t)$ is
then

$$R_{y_N}(s,t) = \mathbf{E}[y_N(t) - \mathbf{E}y_N(t)][y_N(s) - \mathbf{E}y_N(s)] = 2B_N^T e^{J_N t} X e^{-J_N s} B_N, \qquad (7)$$

where $X$ is the covariance of the initial state,

$$X = \mathbf{E}[x(0) - \mathbf{E}x(0)][x(0) - \mathbf{E}x(0)]^T. \qquad (8)$$

In Section 3.4, we discuss how it is reasonable to choose $X$. The arguments
are information theoretical and physical in nature. Both arguments lead to
an equipartition-type statement, which in turn yields the concept of temperature. For
now, let us only define the notion of temperature of a lossless/causal system.

Definition 2 (Temperature). A lossless/causal system with deterministic input
has temperature $T$ ($T$ is scalar) if

$$R_y(s,t) = T \cdot B^T e^{J(t-s)} B.$$

If $X$ commutes with $J_N$ and admits $B_N$ as an eigenvector with eigenvalue $T$,
(7) satisfies Definition 2 and we have (in the sense of distributions)

$$R_{y_N}(s,t) \to 2Tk\,\delta(t - s), \quad t, s \in [0, \tau], \quad N \to \infty. \qquad (9)$$

A stochastic signal with this property is called white noise.
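A small Python sketch can confirm that, for the equipartition choice $X = T I$, the covariance (7) equals $2T\kappa_N(t - s)$, which is the truncated series that tends to $2Tk\,\delta(t - s)$. The block layout of `J` below is one skew-symmetric realization with the required impulse response (the sign convention of the off-diagonal blocks is an assumption), and all numerical values and function names are illustrative.

```python
import numpy as np

def expm_skew(J, t):
    """exp(J t) for a real skew-symmetric J, via the Hermitian matrix 1j*J."""
    w, V = np.linalg.eigh(1j * J)
    return (V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T).real

def build_JB(N, k, tau):
    """A (2N+1)-state realization of kappa_N^c: J skew-symmetric, C = B^T."""
    w0 = np.pi / tau
    Omega = np.diag(w0 * np.arange(1, N + 1))
    J = np.zeros((2 * N + 1, 2 * N + 1))
    J[:N, N:2 * N] = Omega
    J[N:2 * N, :N] = -Omega
    B = np.sqrt(k / tau) * np.concatenate(
        [np.ones(N), np.zeros(N), [1 / np.sqrt(2)]])
    return J, B

def kappa_N(t, N, k, tau):
    """Truncated Fourier series (4)."""
    l = np.arange(1, N + 1)
    return k / (2 * tau) + (k / tau) * np.cos(l * np.pi / tau * t).sum()

N, k, tau, T = 4, 1.5, 2.0, 0.3       # assumed illustrative values
J, B = build_JB(N, k, tau)
X = T * np.eye(2 * N + 1)             # equipartition: X = T I
s, t = 0.4, 1.1
R = 2 * B @ expm_skew(J, t) @ X @ expm_skew(J, -s) @ B   # covariance (7)
```

Because $X = TI$ commutes with $J_N$, the two exponentials combine into $e^{J_N(t-s)}$, which is why `R` depends only on the time difference.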
3.3 Johnson-Nyquist noise 

From Proposition 1 and (9) we obtain the following proposition.

Proposition 2. In the limit when $N \to \infty$, the lossless/causal system $K_N$,
given by (6), converges to

$$y(t) = ku(t) + \sqrt{2kT}\,w(t), \quad t \in [0, \tau], \qquad (10)$$

when it has temperature $T$. The signal $w(t)$ is stochastic white noise of unit
intensity. The input $u(t)$ should satisfy the assumptions of Proposition 1.

Definition 3 (Heat bath). A system (10) is called a heat bath of strength $k$,
temperature $T$, and recurrence time $\tau$.

Hence, in the limit, the uncertainty in the initial state of the microscopic
lossless/causal model is transformed into white noise added to the output
of the macroscopic model (2). This is a generalization of Johnson-Nyquist noise
of resistors, see [10, 11]: It is a fact that careful measurements of the voltage
across a resistor reveal that there is noise that depends on the resistance and
temperature. Usually this noise is modeled by stochastic white noise. The
noise is often explained using methods from statistical mechanics and circuit
theory. See, for example, [1]. Here we obtain exactly the same result using
lossless/causal systems and a suitable definition of temperature.

Remark 1. That Proposition 2 indeed leads to the standard form of the
Johnson-Nyquist noise of a resistor can be seen as follows: We have $v = Ri$ from Ohm's
law. Assume that $i = 0$ and study the variance of $v(t)$ through a low-pass filter
of bandwidth $B$. Then we have, since $|R_w(j\omega)| = 1$ (white noise),

$$\mathbf{E}v(t)^2 = \frac{1}{2\pi}\int_{-2\pi B}^{2\pi B} 2TR\,|R_w(j\omega)|\,d\omega = 4TRB,$$

which is usually how Johnson-Nyquist noise is presented. Notice that
Boltzmann's constant here should be included in the temperature $T$. It is also
interesting to notice that the factor two in the noise intensity $2TR$ in our derivation
originates from the causal/anti-causal decomposition in the construction of $K_N$.
A very different argument is used in the derivation in [1].
3.4 Equipartition of energy 



In this section, we discuss how the covariance of the initial state $x(0)$ of $K_N$,
defined in (5), should be chosen. This discussion leads up to the definition of
temperature, Definition 2. The first argument is information theoretical, and
the second argument has a more physical flavor. As mentioned in the
introduction, how to properly motivate the introduction of the stochastic element is
not easy. Here we just give two arguments whose consequences are compatible
with macroscopic observations, if Johnson-Nyquist noise is modeled by
stochastic white noise. Neither of the arguments is entirely convincing, and we hope to
return to these issues elsewhere.

MaxEnt argument 

The first argument is based on the MaxEnt principle, due to Jaynes [12, 13].
This means that we should assign the distribution of $x(0)$ that maximizes the
Shannon entropy of the distribution subject to all known constraints. The
procedure is justified because it leads to the least biased guess. Assume that
the expected internal energy of the initial state is $E$:

$$E = \mathbf{E}\,\frac{1}{2}x(0)^T x(0) = \frac{1}{2}\mathbf{E}x(0)^T\,\mathbf{E}x(0) + \frac{1}{2}\mathrm{Tr}\,X.$$

Maximization of the Shannon entropy subject to this constraint leads to a
distribution of $x(0)$ that is Gaussian with mean zero and with covariance matrix

$$X = \frac{2E}{2N + 1}I_{2N+1}.$$

If we define the temperature $T$ as $2E/(2N + 1)$ and use this $X$ in (7), we see
that the covariance function of $y_N$ satisfies the requested relation in Definition 2.
This means that the energy is distributed equally between all degrees of freedom.
We have equipartition. The temperature is the expected amount of energy (up
to a factor two) of each degree of freedom. This coincides with the usual notion
of temperature in physics.

White noise argument 

Assume that $K_N$ had temperature zero a long time back, i.e., $x(-h) = 0$,
where $h$ is a large number. We will be more precise about the size of $h$ later. We
start our experiment at time $t = 0$ and wonder what a reasonable assumption
on the initial state $x(0)$ is. Let us now assume that $K_N$ has been subject to
low-intensity white noise over the time interval $[-h, 0]$, say

$$\mathbf{E}u(t)u(s) = \frac{i}{h}\delta(t - s), \quad \mathbf{E}u(t) = 0,$$

where $i$ is an intensity constant. One can say that $K_N$ has been weakly
connected to an even larger heat bath for a long time.

In the end, we want to compute $R_{y_N}$ as defined in (7), and it is of interest
to compute $X$. We have

$$X = \mathbf{E}x(0)x(0)^T = \frac{2i}{h}\int_{-h}^{0} e^{-J_N s}B_N B_N^T e^{J_N s}\,ds = \frac{2i}{h}\,\frac{k}{\tau}\int_{-h}^{0} v(s)v(s)^T\,ds,$$

where

$$v(s) = \begin{pmatrix} \cos\omega_0 s & \cos 2\omega_0 s & \cdots & \sin\omega_0 s & \sin 2\omega_0 s & \cdots & 1/\sqrt{2} \end{pmatrix}^T.$$

Notice that if $h = 2\tau$ we have that

$$X = \frac{ik}{\tau}I_{2N+1};$$

this is the amount of time the white noise needs to excite all the modes equally.
When $h \gg 2\tau$ we can use that (here $k$ and $l$ denote mode indices)

$$\lim_{h\to\infty}\frac{1}{h}\int_{-h}^{0} \cos k\omega_0 s\,\cos l\omega_0 s\,ds = \frac{1}{2}\delta_{kl},$$

$$\lim_{h\to\infty}\frac{1}{h}\int_{-h}^{0} \sin k\omega_0 s\,\cos l\omega_0 s\,ds = 0.$$

Hence we have that $X \to (ik/\tau)I_{2N+1}$, $h \to \infty$, and from (7) we have

$$R_{y_N}(s,t) = 2B_N^T e^{J_N t} X e^{-J_N s} B_N = \frac{ik}{\tau}\,2B_N^T e^{J_N(t-s)} B_N.$$

According to Definition 2, the temperature of $K_N$ is $T = ik/\tau$.
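The claim that $h = 2\tau$ excites all modes equally can be checked with a short Python sketch that integrates the outer product of the mode vector over $[-2\tau, 0]$. The function names and the numerical values of $N$, $k$, $\tau$ and $i$ are illustrative assumptions.

```python
import numpy as np

def mode_vector(s, N, tau):
    """Rows: cos(l w0 s), sin(l w0 s) for l = 1..N, then the constant 1/sqrt(2)."""
    w0 = np.pi / tau
    l = np.arange(1, N + 1)[:, None]
    return np.vstack([np.cos(l * w0 * s),
                      np.sin(l * w0 * s),
                      np.full((1, s.size), 1 / np.sqrt(2))])

def initial_covariance(h, N, k, tau, i, n_grid=20001):
    """X = (2i/h)(k/tau) int_{-h}^0 v(s) v(s)^T ds, by the trapezoidal rule."""
    s = np.linspace(-h, 0.0, n_grid)
    v = mode_vector(s, N, tau)
    weights = np.full(n_grid, h / (n_grid - 1))
    weights[[0, -1]] *= 0.5                     # trapezoidal end-point weights
    return (2 * i / h) * (k / tau) * (v * weights) @ v.T

N, k, tau, i = 3, 2.0, 1.5, 0.7                 # assumed illustrative values
X = initial_covariance(2 * tau, N, k, tau, i)
```

Over exactly one period $2\tau$ the cross terms integrate to zero, so `X` comes out as a multiple of the identity, which is the equipartition statement.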

4 Interconnections 

Definition 4. The physical interconnection of the lossless/causal system $(J_1, B_1, B_1^T)$
to the lossless/causal system $(J_2, B_2, B_2^T)$ is given by

$$\frac{d}{dt}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} J_1 & -B_1 B_2^T \\ B_2 B_1^T & J_2 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} B_1 \\ 0 \end{pmatrix} u,$$

$$y = B_1^T x_1.$$

The physical interconnection is still lossless/causal. The interconnection
makes physical sense if one studies interconnections of circuit or mechanical
models, for example. It is also a neutral interconnection, as defined in [8].
Motivated by this definition, and by the fact that we showed in Section 3 that the
lossless/causal system $(J_N, \sqrt{2}B_N, \sqrt{2}C_N)$ converges to a heat bath, we make the
following definition.



Definition 5. The physical interconnection of the lossless/causal model $(J, B, B^T)$
to a heat bath of strength $k$, temperature $T$, and recurrence time $\tau$, is given by

$$\dot{x}(t) = (J - kBB^T)x(t) + Bu(t) - B\sqrt{2kT}\,w(t),$$
$$y(t) = B^T x(t), \qquad (11)$$

for $t \in [0, \tau]$, where $w$ is stochastic white noise of unit intensity.

Notice that even though $(J, B, B^T)$ is lossless, when connected to the heat
bath, (11) looks dissipative since the eigenvalues of $J - kBB^T$ have negative
real parts.
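To illustrate Definition 5, the following Python sketch connects the circuit of Example 1 (with assumed unit component values) to a heat bath and inspects the eigenvalues before and after. It also checks a standard consequence that is not stated in the text: for a linear system driven by unit-intensity white noise, the stationary covariance $P$ solves a Lyapunov equation, and here $P = TI$ does so. The values of $k$ and $T$ are illustrative assumptions.

```python
import numpy as np

# LC circuit of Example 1 with assumed unit values C1 = L1 = C2 = 1.
J = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
B = np.array([[1.0], [0.0], [0.0]])
k, T = 0.5, 2.0                      # assumed heat-bath strength and temperature

A = J - k * B @ B.T                  # drift matrix of (11)
eig_J = np.linalg.eigvals(J)         # purely imaginary: lossless
eig_A = np.linalg.eigvals(A)         # negative real parts: looks dissipative

# Stationary covariance P of (11) with u = 0 solves
#   A P + P A^T + 2 k T B B^T = 0,
# and P = T I satisfies it because J + J^T = 0.
P = T * np.eye(3)
residual = A @ P + P @ A.T + 2 * k * T * B @ B.T
```

The vanishing residual says that, in this sketch, the connected system settles to the same equipartition covariance $TI$ that defines the temperature of the bath.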



5 Back Action of Linear Measurements 

As a simple application of the results in Section 3 and the definitions in Section 4,
consider the problem of measuring the output $y(t)$ of the lossless/causal system
$(J, B, B^T)$. For this purpose, we define an idealized measurement device

$$y_m(t) = k_m y(t), \qquad (12)$$

where $k_m > 0$ is a scalar, and the signal $y_m(t)$ is such that we can read it out
perfectly. With such a measurement device, we can also read out the output
$y(t) = y_m(t)/k_m$ perfectly.

Now we construct a slightly less idealized measurement device by replacing
(12) by a lossless/causal approximation of (12). This is a more physical device,
as argued before. According to Section 3, we obtain

$$y_m(t) = k_m y(t) + \sqrt{2k_m T_m}\,w(t), \qquad (13)$$

in the limit if the initial state of the measurement device is not perfectly known.
$T_m$ is the temperature of the device, which is essentially a heat bath. If we
make a physical interconnection of $(J, B, B^T)$ to (13), we obtain

$$\dot{x}(t) = (J - k_m BB^T)x(t) - B\sqrt{2k_m T_m}\,w(t),$$
$$\hat{y}(t) = y_m(t)/k_m = B^T x(t) + \sqrt{\frac{2T_m}{k_m}}\,w(t), \qquad (14)$$

using (13) and Definition 5, where $\hat{y}(t)$ is an estimate of $y(t)$. In (14) we
identify

process noise: $p(t) = \sqrt{2k_m T_m}\,w(t)$,

measurement noise: $m(t) = \sqrt{\dfrac{2T_m}{k_m}}\,w(t)$.

The measurement device generates process noise and dissipation. This is called
back action of measurements. This is a well-known phenomenon in quantum
physics. Here we obtain a similar effect based on lossless/causal approximations
and using physical interconnections. Also notice that it holds that

$$\mathbf{E}p(t)m(s) = 2T_m\,\delta(t - s). \qquad (15)$$

The cross-covariance between process and measurement noise is independent of
the amplification $k_m$ of the measurement device. For large $k_m$, we get a good
estimate of $y$, but on the other hand, the process noise gets large. Hence, there
is a trade-off. It is only the temperature $T_m$ of the measurement device that
controls the trade-off in (15).
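The trade-off can be made concrete with a few lines of Python. The sketch below tabulates the process-noise and measurement-noise intensities for several assumed amplifications $k_m$ at an assumed device temperature $T_m = 1$, and confirms that their geometric mean, the cross intensity in (15), does not depend on $k_m$.

```python
import numpy as np

T_m = 1.0                                    # assumed device temperature
k_m = np.array([0.1, 1.0, 10.0, 100.0])      # assumed amplifications

process_intensity = 2 * k_m * T_m            # E p(t)p(s) = 2 k_m T_m delta(t - s)
measurement_intensity = 2 * T_m / k_m        # E m(t)m(s) = (2 T_m / k_m) delta(t - s)

# p and m are driven by the same w, so E p(t)m(s) = sqrt(product) * delta(t - s).
cross_intensity = np.sqrt(process_intensity * measurement_intensity)
```

Raising $k_m$ lowers the measurement noise by exactly the factor by which it raises the process noise; only lowering $T_m$ improves both.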

6 Lossless / Causal Approximations of Dissipative 
Systems with Memory 

In this section, we generalize the procedure from Section 3 to dissipative systems
that have memory. We consider strictly stable linear causal systems $G$ with
impulse response $g$. Their input-output relation is given by

$$y(t) = \int_0^{t} g(t - s)u(s)\,ds. \qquad (16)$$

The system (16) is dissipative with respect to the work rate $w(t) = y(t)u(t)$ if

$$\int_0^{T} y(t)u(t)\,dt \ge 0,$$

for all $T \ge 0$ and admissible $u(t)$. An equivalent condition, see [14], is that the
transfer function is positive real

$$\mathrm{Re}\,\hat{g}(j\omega) \ge 0 \quad \text{for all } \omega. \qquad (17)$$

Here $\hat{g}(j\omega)$ is the Fourier transform of $g(t)$.

The following theorem shows that the system (16) is dissipative if and only
if it can be approximated arbitrarily well by a lossless/causal system over any
finite time horizon $[0, \tau]$.

Theorem 1. Assume that $G$ is a linear (causal) system with impulse response
$g$, such that $g \in L_1 \cap L_2(0, \infty)$ and $g' \in L_1(0, \infty)$. Then $G$ is dissipative if and
only if for all $\varepsilon > 0$ and $\tau \ge 0$ there is a lossless/causal linear system $G_\tau$ with
impulse response $g_\tau$ such that

$$\|g - g_\tau\|_{L_2[0,\tau]} \le \varepsilon. \qquad (18)$$

Proof. See Appendix A. □

Notice that Theorem 1 shows that a large class of dissipative systems
(macroscopic systems) can be approximated by the lossless/causal systems we
introduced in Section 2.
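For a concrete feel of Theorem 1, the Python sketch below takes the assumed example $g(t) = e^{-t}$, whose transfer function $1/(s+1)$ has real part $1/(1+\omega^2) \ge 0$, computes its cosine coefficients on $[0, \tau]$ by quadrature, and measures the $L_2[0,\tau]$ error of the truncated cosine series. Nonnegative coefficients are what allow such a series to be realized by a lossless/causal system as in Section 3. The helper names, $\tau = 5$, and $N = 50$ are illustrative assumptions.

```python
import numpy as np

def trapezoid(f, t):
    """Composite trapezoidal rule for samples f on the grid t."""
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

def cosine_coefficients(g, tau, N, n_grid=20001):
    """a_k = (2/tau) int_0^tau g(t) cos(k pi t / tau) dt for k = 0..N."""
    t = np.linspace(0.0, tau, n_grid)
    a = np.array([2 / tau * trapezoid(g(t) * np.cos(k * np.pi * t / tau), t)
                  for k in range(N + 1)])
    return t, a

def truncated_series(t, a, tau):
    """a_0/2 + sum_{k>=1} a_k cos(k pi t / tau)."""
    k = np.arange(1, a.size)[:, None]
    return a[0] / 2 + (a[1:, None] * np.cos(k * np.pi * t / tau)).sum(axis=0)

def g(t):
    return np.exp(-t)        # assumed dissipative example, G(s) = 1/(s + 1)

tau, N = 5.0, 50
t, a = cosine_coefficients(g, tau, N)
err = np.sqrt(trapezoid((g(t) - truncated_series(t, a, tau))**2, t))
```

For this particular $g$ all coefficients are positive, so no terms need to be discarded; in general the negative coefficients must be removed, at a cost in accuracy that the proof of the theorem controls.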



7 Fluctuation-dissipation Theorem 

If a lossless/causal system satisfies Definition 2, then by definition we have

$$R_y(s,t) = T \cdot B^T e^{J(t-s)} B.$$

This can be said to be the fluctuation of the system. The response of the
lossless/causal system to an impulse $u(t) = \delta(t)$ is

$$B^T e^{Jt} B.$$

If the lossless/causal system approximates a dissipative system over $[0, \tau]$, see
Theorem 1, then the impulse response decays over this time interval. This
represents the dissipation of the system. The expressions of the fluctuation and
dissipation are equal up to a constant, the temperature. This is a property
that can be observed in physical systems close to equilibrium (and hence can
be linearized).

8 Conclusions 

In this paper, we defined the class of lossless/causal systems and used them to 
approximate dissipative systems. We obtained an if and only if characterization 
and gave explicit error bounds that depend on the time horizon and the order of 
the approximations. When applied to memoryless models, we saw that
Johnson-Nyquist noise (macroscopic measurable noise) can be explained by uncertainty
in the initial state of a lossless/causal approximation of very high order. We 
also saw that using these techniques, it was relatively easy to obtain a back- 
action effect of measurements. This gave rise to a trade off between process and 
measurement noise. 
A Proof of Theorem 1

We first show the 'if' direction. Assume first the opposite: that there are
lossless approximations that satisfy (18) even though $G$ is not dissipative. If $G$
is not dissipative, we can find an input $u(t)$ over a finite interval $[0, T]$ such that

$$\int_0^{T} y(t)u(t)\,dt = -K_1 < 0,$$

i.e., we extract energy from $G$ even though its initial state is zero. Call
$\|u\|_{L_1[0,T]} = K_2$ and $\|u\|_{L_2[0,T]} = K_3$. For any $\tau \ge T$ and $\varepsilon > 0$ we thus have

$$\left|\int_0^{T} (y_\tau(t) - y(t))u(t)\,dt\right| \le \varepsilon K_2 K_3,$$

by the assumption that lossless approximations $G_\tau$ exist and using the
Cauchy-Schwarz inequality. But the lossless approximation satisfies

$$\int_0^{T} y_\tau(t)u(t)\,dt = \frac{1}{2}x_\tau(T)^T x_\tau(T),$$

since $x_\tau(0) = 0$. Hence,

$$-\int_0^{T} y(t)u(t)\,dt = K_1 \le \varepsilon K_2 K_3 - \frac{1}{2}x_\tau(T)^T x_\tau(T) \le \varepsilon K_2 K_3.$$

But since $\varepsilon$ can be made arbitrarily small, this leads to a contradiction.

To prove the 'only if' direction we will explicitly construct a $G_\tau$ that satisfies
(18). We first need to make some definitions. Let

$$C = \frac{2}{\pi}\int_0^{\infty} |g'(t)|\,dt,$$

that is finite when $g, g' \in L_1$. Also define

$$\delta(t) = \int_t^{\infty} |g(s)|\,ds,$$

that is a continuous, decreasing function that satisfies $\lim_{t\to\infty}\delta(t) = 0$. We
will need that the recurrence time $\tau$ is such that

$$\delta(\tau) \le \varepsilon^2/(8C). \qquad (19)$$

If the chosen $\tau$ does not satisfy this relation, we can without loss of generality
increase it to the smallest $\tau$ that satisfies this bound. That this has been done
will be assumed in the following.

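The model $G_\tau$ we construct will be based on a truncated version $g_{N,\tau}(t)$ of the
impulse response, where

$$g_{N,\tau}(t) = \frac{a_0}{2} + \sum_{k=1}^{N} a_k\cos\frac{k\pi t}{\tau}, \quad t \in [0, \tau],$$

$$a_k = \frac{2}{\tau}\int_0^{\tau} g(t)\cos\frac{k\pi t}{\tau}\,dt,$$

$$\|g_{N,\tau}\|^2_{L_2[0,\tau]} = \frac{\tau}{4}a_0^2 + \frac{\tau}{2}\sum_{k=1}^{N} a_k^2 \le \frac{\tau}{2}\sum_{k=0}^{N} a_k^2.$$

Assume that $\tau$ is fixed as above. Next pick the smallest $N$ such that

$$\|g - g_{N,\tau}\|_{L_2[0,\tau]} \le \frac{\varepsilon}{2}. \qquad (20)$$

Such an $N$ always exists since $g \in L_2$ and the cos-terms are a basis in $L_2[0, \tau]$.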



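Define

$$\hat{g}_{\tau}(j\omega) = \int_0^{\tau} g(t)e^{-j\omega t}\,dt,$$

and notice that

$$a_k = \frac{2}{\tau}\,\mathrm{Re}\,\hat{g}_{\tau}\!\left(j\frac{k\pi}{\tau}\right).$$

We have that

$$|\mathrm{Re}\,\hat{g}(j\omega) - \mathrm{Re}\,\hat{g}_{\tau}(j\omega)| = \left|\mathrm{Re}\int_{\tau}^{\infty} g(t)e^{-j\omega t}\,dt\right| \le \|g\|_{L_1[\tau,\infty)} = \delta(\tau) \le \varepsilon^2/(8C).$$

Since $\mathrm{Re}\,\hat{g}(j\omega) \ge 0$ for all $\omega$ by (17), we have

$$a_k \ge -\frac{\varepsilon^2}{4C\tau}. \qquad (21)$$

We will need a second bound on $a_k$ that bounds the rate of decay to zero. We
have

$$a_k = \frac{2}{\tau}\int_0^{\tau} g(t)\cos\frac{k\pi t}{\tau}\,dt = \frac{2}{\tau}\left[g(t)\frac{\tau}{k\pi}\sin\frac{k\pi t}{\tau}\right]_0^{\tau} - \frac{2}{\tau}\int_0^{\tau} g'(t)\frac{\tau}{k\pi}\sin\frac{k\pi t}{\tau}\,dt,$$

and thus

$$|a_k| \le \frac{C}{k}, \qquad (22)$$

independent of $\tau$. Together, (21) and (22) give

$$a_k \ge \max\left\{-\frac{\varepsilon^2}{4C\tau}, -\frac{C}{k}\right\} \quad \text{for all } k. \qquad (23)$$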



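Next, define

$$g_{N,\tau}(t) = g^{+}_{N,\tau}(t) + g^{-}_{N,\tau}(t),$$

where $g^{-}_{N,\tau}(t)$ contains all the terms in $g_{N,\tau}(t)$ with strictly negative Fourier
coefficients. Notice that $g^{+}_{N,\tau}$ can be realized with a linear lossless/causal system.
Compare with (5). We can bound the worst-case $L_2$-norm of $g^{-}_{N,\tau}$. Using (23)
we have

$$\|g^{-}_{N,\tau}\|^2_{L_2[0,\tau]} \le \frac{\tau}{2}\sum_{a_k < 0} a_k^2 \le \frac{\tau}{2}\sum_{k \le 4C^2\tau/\varepsilon^2} \frac{\varepsilon^4}{16C^2\tau^2} + \frac{\tau}{2}\sum_{k > 4C^2\tau/\varepsilon^2} \frac{C^2}{k^2}$$

$$\le \frac{\varepsilon^4}{32C^2\tau}\cdot\frac{4C^2\tau}{\varepsilon^2} + \frac{\tau}{2}\cdot C^2\cdot\frac{\varepsilon^2}{4C^2\tau} = \frac{\varepsilon^2}{4},$$

independent of how large $N$ is.

A lossless/causal approximation that satisfies the bound (18) is now given
by $g_\tau(t) = g^{+}_{N,\tau}(t)$, where $\tau$ and $N$ were fixed in (19) and (20). This is because

$$\|g - g^{+}_{N,\tau}\|_{L_2[0,\tau]} \le \|g - g_{N,\tau}\|_{L_2[0,\tau]} + \|g_{N,\tau} - g^{+}_{N,\tau}\|_{L_2[0,\tau]} \le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This concludes the proof.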



